---
dataset_name: 10k_diabetes.xlsx
expiration_date: 10-10-2024
owner: izzy@datarobot.com
domain: trust-explainable-ai
title: Understand the Word Cloud
description: This tutorial provides instructions to access and understand the Word Cloud insight.
url: https://docs.datarobot.com/en/tutorials/explore-ai-insights/tut-wordcloud.html

---

# Understand the Word Cloud {: #understand-the-word-cloud }

In this tutorial, you'll learn how to draw insights from text features in models using the Word Cloud.

To access the Word Cloud, select a model on the Leaderboard and click **Understand** > **Word Cloud**.

## Takeaways {: #takeaways }

This tutorial explains:

- How to access and interpret the Word Cloud
- How to export the Word Cloud as raw values

## Access the Word Cloud {: #access-the-word-cloud }

When a training dataset contains one or more text features, DataRobot specially trains models to generate text-based insights, including the Word Cloud. If a dataset has multiple text features, a Word Cloud is created for each one.

1. Select a model that supports Word Clouds on the Leaderboard, for example the **Auto-Tuned Word N-Gram Text Modeler**. (See the [**Word Cloud**](word-cloud) description for other model types.)

    ![Select text-enabled model](images/tut-wordcloud-2.png)

    ??? tip "Search tip for finding models"
        To narrow down Leaderboard results, enter "insights" into the search bar at the top to quickly find models that produce a visualization on the [**Insights**](analyze-insights) page.

2. Click **Understand** > **Word Cloud**.

    ![Click Word Cloud tab](images/tut-wordcloud-3.png)


## Interpret the Word Cloud {: #interpret-the-word-cloud }

After you select **Word Cloud**, the window displays a visualization of the model's top 200 text features, chosen based on their relationship to the target feature.

1. Mouse over a word. The active word displays in the upper-left corner.

    ![Hover over word](images/tut-wordcloud-5.png)

    ??? tip "Stop words"
        To prevent common stop words (the, for, was, etc.) from appearing in the Word Cloud, select the box next to **Filter stop words**.

2. Look at the size of the word. Size represents the frequency of the word in the dataset: larger words appear more frequently than smaller words.

    ![Word size](images/tut-wordcloud-4.png)

3. Look at the color of the word. Color represents the word's relationship to the target feature: red indicates a positive effect and blue indicates a negative effect.

    ![Word color](images/tut-wordcloud-6.png)

4. Look at the [coefficient](coefficients#coefficientpreprocessing-information-with-text-variables) value, which quantifies the effect that the color indicates.
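
The reading steps above can be summarized in a few lines of Python. This is only an illustrative sketch: the function name `read_word`, the point sizes, the linear scaling, and the sample words and values are all assumptions made up for this example, not DataRobot's rendering logic.

```python
def read_word(word, coefficient, frequency, min_pt=10, max_pt=48):
    """Return the (color, font size) a reader would expect for one word.

    Hypothetical encoding for illustration only; not DataRobot's renderer.
    coefficient: signed effect on the target (its sign picks the color).
    frequency: normalized occurrence in [0, 1] (it drives the size).
    """
    if coefficient > 0:
        color = "red"    # positive effect on the target
    elif coefficient < 0:
        color = "blue"   # negative effect on the target
    else:
        color = "gray"   # assumption: no measurable effect
    # Linear scaling is an assumption; the point is that size grows with frequency.
    clamped = max(0.0, min(1.0, frequency))
    size = min_pt + (max_pt - min_pt) * clamped
    return color, round(size)


# Made-up (word, coefficient, frequency) values.
words = [("insulin", 0.8, 0.9), ("routine", -0.4, 0.2)]
for word, coef, freq in words:
    color, size = read_word(word, coef, freq)
    print(f"{word}: {color}, {size}pt")
```

A large red word is therefore both frequent and positively related to the target, while a small blue word is rare and negatively related.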


## Export the Word Cloud {: #export-the-word-cloud }

You can export Word Cloud insights as raw values in a CSV file. To export, click **Export**, and then click **Download** in the resulting dialog.

![Click Export button](images/tut-wordcloud-8.png)

When the download is complete, open the CSV file.

![Export Word Cloud](images/tut-wordcloud-7.png)

Fields of the CSV are described below:

Column | Description
------ | -----------
`name` | The word found in the feature named in `var_name`.
`var_name` | Feature name (name of the column).
`resp` | Normalized coefficient from the linear model.
`freq` | Normalized word occurrences.
`abs_freq` | Total word occurrences (count).
`stop_word` | Whether stop words are filtered.
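
Once you have the file, a short script can rank words by the strength of their coefficient. The following is a minimal sketch using only Python's standard library. The sample rows, the feature name `notes`, and the helper names are made up for illustration. The script also assumes that `stop_word` holds a true/false-style value and that flagged rows should be skipped, so check both against your own export.

```python
import csv
import io

# Made-up rows in the exported column layout. For a real export, pass an
# open file handle to load_word_cloud() instead of this string.
SAMPLE = """\
name,var_name,resp,freq,abs_freq,stop_word
the,notes,0.01,1.0,900,True
failure,notes,0.75,0.2,180,False
unspecified,notes,-0.5,0.4,360,False
acute,notes,0.3,0.1,90,False
"""


def load_word_cloud(handle):
    """Parse exported rows into dicts with typed values."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "name": row["name"],
            "var_name": row["var_name"],
            "resp": float(row["resp"]),
            "freq": float(row["freq"]),
            "abs_freq": int(float(row["abs_freq"])),
            # Assumption: the flag is stored as a true/false-style string.
            "stop_word": row["stop_word"].strip().lower() in {"true", "1", "yes"},
        })
    return rows


def top_words(rows, n=3, skip_flagged=True):
    """Return the n rows with the largest absolute coefficient."""
    kept = [r for r in rows if not (skip_flagged and r["stop_word"])]
    return sorted(kept, key=lambda r: abs(r["resp"]), reverse=True)[:n]


rows = load_word_cloud(io.StringIO(SAMPLE))
for r in top_words(rows, n=2):
    print(r["name"], r["resp"], r["abs_freq"])
```

Sorting by the absolute value of `resp` surfaces the words with the strongest effect in either direction. The sign then tells you whether that effect is positive or negative.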

## Learn more {: #learn-more }

??? tip "How does DataRobot handle text features?"

    If a dataset contains one or more text features, DataRobot uses natural language processing (NLP) tools, such as Auto-Tuned Word N-Gram Text Modelers, to specially tune models and generate NLP visualizations, including frequency value tables and word clouds.

    During model building, DataRobot incorporates a matrix of word-grams in blueprints. The matrix is produced using a common technique, TF-IDF values, and a combination of multiple text columns.

    For large datasets, DataRobot uses Auto-Tuned Word N-Gram Text Modelers, which look at one text column at a time. This approach fits a single N-Gram model for each text feature in the input dataset, and then passes the predictions from these models as inputs to other models.
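
To make the idea of a TF-IDF word-gram matrix concrete, here is a minimal sketch in plain Python. It is not DataRobot's implementation. Whitespace tokenization, the unigram-plus-bigram range, and the classic `tf * log(N / df)` weighting are all assumptions chosen for brevity, and the two sample documents are made up.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Split text on whitespace and return all word n-grams up to n_max."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_matrix(docs, n_max=2):
    """Build a documents-by-n-grams TF-IDF matrix (assumed classic weighting)."""
    counts = [Counter(word_ngrams(d, n_max)) for d in docs]
    vocab = sorted(set().union(*counts))
    # Document frequency: number of documents containing each n-gram.
    df = Counter(gram for c in counts for gram in c)
    n_docs = len(docs)
    matrix = []
    for c in counts:
        total = sum(c.values()) or 1  # avoid dividing by zero for empty text
        matrix.append(
            [(c[g] / total) * math.log(n_docs / df[g]) for g in vocab]
        )
    return vocab, matrix


# Made-up documents for illustration.
docs = ["chest pain", "chest infection"]
vocab, matrix = tfidf_matrix(docs)
for gram, weight in zip(vocab, matrix[0]):
    print(f"{gram!r}: {weight:.3f}")
```

In this sketch, an n-gram that appears in every document (here, `chest`) receives a weight of zero, while n-grams that distinguish one document from another receive positive weights.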


**Documentation:**

- [Additional details on Word Cloud insights](analyze-insights#word-cloud-insights)
- [Other text-based insights in DataRobot](analyze-insights#text-based-insights)
